35 research outputs found

    Analyzing Clustered Latent Dirichlet Allocation

    Dynamic Topic Models (DTM) are a way to extract time-variant information from a collection of documents. The only available implementation of this is slow, taking days to process a corpus of 533,588 documents. In order to see how topics - both their key words and their proportional size in all documents - change over time, we analyze Clustered Latent Dirichlet Allocation (CLDA) as an alternative to DTM. This algorithm is based on existing parallel components, using Latent Dirichlet Allocation (LDA) to extract topics at local times, and k-means clustering to combine topics from different time periods. This method is two orders of magnitude faster than DTM, and allows for more freedom of experiment design. Results show that most topics generated by this algorithm are similar to those generated by DTM at both the local and global level using the Jaccard index and Sørensen-Dice coefficient, and that this method's perplexity compares favorably to DTM. We also explore tradeoffs in CLDA method parameters.
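The abstract compares topics with the Jaccard index and the Sørensen-Dice coefficient. As a minimal sketch of those two measures, assuming each topic is represented by a set of its top key words (the abstract does not say how the authors represent topics for this comparison, and the word lists below are invented for illustration):

```python
def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two word collections."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention chosen here: two empty sets count as identical
    return len(a & b) / len(a | b)


def dice(a, b):
    """Sørensen-Dice coefficient 2|A ∩ B| / (|A| + |B|)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # same convention as above
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical top-word lists for two topics (illustrative only).
topic_a = ["network", "packet", "traffic", "router"]
topic_b = ["network", "packet", "latency", "router"]

print(jaccard(topic_a, topic_b))  # 3 shared words out of 5 distinct words
print(dice(topic_a, topic_b))     # 2 * 3 shared words over 4 + 4 words
```

Both measures range from 0 (no shared words) to 1 (identical sets), and Dice is never smaller than Jaccard for the same pair.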

    Automated Cluster Provisioning And Workflow Management for Parallel Scientific Applications in the Cloud

    Many commercial cloud providers and tools are available that researchers could utilize to advance computational science research. However, adoption by the research community has been slow. In this paper we describe the automated Provisioning And Workflow (PAW) management tool for parallel scientific applications in the cloud. PAW is a comprehensive resource provisioning and workflow tool that automates the steps of dynamically provisioning a large-scale cluster environment in the cloud, executing a set of jobs or a custom workflow and, after the jobs have completed, de-provisioning the cluster environment in a single operation. A key characteristic of PAW is that it separates the provisioning of cluster resources in the cloud from the management of scientific workflow on these resources, which enables fine-grained decisions about performance and cost trade-offs in a commercial cloud environment. This paper describes our initial AWS implementation of PAW for executing a large parameter sweep workflow. We demonstrate this using an MPI-based topic modeling application. PAW provides a standardized, simplified, and pluggable interface that can easily be expanded to support a variety of underlying cloud or cluster hardware environments, user-facing scheduling systems, workflows, and scientific applications.

    Random Access in Nondelimited Variable-length Record Collections for Parallel Reading with Hadoop

    The industry standard Packet CAPture (PCAP) format for storing network packet traces is normally only readable in serial due to its lack of delimiters, indexing, or blocking. This presents a challenge for parallel analysis of large networks, where packet traces can be many gigabytes in size. In this work we present RAPCAP, a novel method for random access into variable-length record collections like PCAP by identifying a record boundary within a small number of bytes of the access point. Unlike related heuristic methods that can limit scalability with a nonzero probability of error, the new method offers a correctness guarantee with a well-formed file and does not rely on prior knowledge of the contents. We include a practical implementation of the algorithm with an extension to the Hadoop framework, and a performance comparison to serial ingestion. Finally, we present a number of similar storage types that could utilize a modified version of RAPCAP for random access.
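To illustrate why PCAP is normally read serially, here is a minimal sketch of the serial baseline, not of RAPCAP itself (the abstract does not describe that algorithm). It assumes the classic microsecond-resolution PCAP layout: a 24-byte global header followed by records that each start with a 16-byte header whose third field is the captured length. The helper names `record_offsets` and `make_pcap` are hypothetical.

```python
import struct

GLOBAL_HEADER_LEN = 24  # classic pcap file header
RECORD_HEADER_LEN = 16  # ts_sec, ts_usec, incl_len, orig_len (4 bytes each)


def record_offsets(data):
    """Return the byte offset of every record header, found by walking the file."""
    magic = data[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"  # file written little-endian
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"  # file written big-endian
    else:
        raise ValueError("not a classic microsecond-resolution pcap file")

    offsets = []
    pos = GLOBAL_HEADER_LEN
    while pos + RECORD_HEADER_LEN <= len(data):
        _ts_sec, _ts_usec, incl_len, _orig_len = struct.unpack_from(
            endian + "IIII", data, pos
        )
        offsets.append(pos)
        # The next record's position is known only after reading this length,
        # so a reader that starts mid-file has no delimiter to sync on.
        pos += RECORD_HEADER_LEN + incl_len
    return offsets


def make_pcap(payloads):
    """Build a tiny in-memory pcap file for demonstration (illustrative only)."""
    # magic, version 2.4, thiszone, sigfigs, snaplen, linktype (1 = Ethernet)
    out = struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 65535, 1)
    for i, payload in enumerate(payloads):
        out += struct.pack("<IIII", i, 0, len(payload), len(payload)) + payload
    return out


print(record_offsets(make_pcap([b"abc", b"hello", b""])))
```

Because each offset depends on the length stored in the previous record, splitting the file at an arbitrary byte for parallel readers requires some way to locate the next true record boundary, which is the problem the abstract says RAPCAP addresses.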

    Effects of Hypothalamic Neurodegeneration on Energy Balance

    Normal aging in humans and rodents is accompanied by a progressive increase in adiposity. To investigate the role of hypothalamic neuronal circuits in this process, we used a Cre-lox strategy to create mice with specific and progressive degeneration of hypothalamic neurons that express agouti-related protein (Agrp) or proopiomelanocortin (Pomc), neuropeptides that promote positive or negative energy balance, respectively, through their opposing effects on melanocortin receptor signaling. In previous studies, Pomc mutant mice became obese, but Agrp mutant mice were surprisingly normal, suggesting potential compensation by neuronal circuits or genetic redundancy. Here we find that Pomc-ablation mice develop obesity similar to that described for Pomc knockout mice, but also exhibit defects in compensatory hyperphagia similar to what occurs during normal aging. Agrp-ablation female mice exhibit reduced adiposity with normal compensatory hyperphagia, while animals ablated for both Pomc and Agrp neurons exhibit an additive interaction phenotype. These findings provide new insight into the roles of hypothalamic neurons in energy balance regulation, and provide a model for understanding defects in human energy balance associated with neurodegeneration and aging.

    Maternal overnutrition programs epigenetic changes in the regulatory regions of hypothalamic Pomc in the offspring of rats.

    Maternal overnutrition has been implicated in affecting the offspring by programming metabolic disorders such as obesity and diabetes, by mechanisms that are not clearly understood. This study aimed to determine the long-term impact of maternal high-fat (HF) diet feeding on epigenetic changes in the offspring's hypothalamic Pomc gene, coding a key factor in the control of energy balance. Further, it aimed to study the additional effects of postnatal overnutrition on epigenetic programming by maternal nutrition. Eight-week-old female Sprague-Dawley rats were fed HF diet or low-fat (LF) diet for 6 weeks before mating, and throughout gestation and lactation. At postnatal day 21, samples were collected from a third offspring and the remainder were weaned onto LF diet for 5 weeks, after which they were either fed LF or HF diet for 12 weeks, resulting in four groups of offspring differing by their maternal and postweaning diet. With maternal HF diet, offspring at weaning had rapid early weight gain, increased adiposity, and hyperleptinemia. The programmed adult offspring, subsequently fed LF diet, retained the increased body weight. Maternal HF diet combined with offspring HF diet caused more pronounced hyperphagia, fat mass, and insulin resistance. The ARC Pomc gene from programmed offspring at weaning showed hypermethylation in the enhancer (nPE1 and nPE2) regions and in the promoter sequence mediating leptin effects. Interestingly, hypermethylation at the Pomc promoter but not at the enhancer region persisted long term into adulthood in the programmed offspring. However, there were no additive effects on methylation levels in the regulatory regions of Pomc in programmed offspring fed a HF diet. Maternal overnutrition programs long-term epigenetic alterations in the offspring's hypothalamic Pomc promoter. This predisposes the offspring to metabolic disorders later in life.